Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems (CMU-PDL-07-105)
Abstract
Cluster-based and iSCSI-based storage systems rely on standard TCP/IP-over-Ethernet for client access to data. Unfortunately, when data is striped over multiple networked storage nodes, a client can experience a TCP throughput collapse that results in much lower read bandwidth than should be provided by the available network links. Conceptually, this problem arises because the client simultaneously reads fragments of a data block from multiple sources that together send enough data to overload the switch buffers on the client’s link. This paper analyzes this Incast problem, explores its sensitivity to various system parameters, and examines the effectiveness of alternative TCP- and Ethernet-level strategies in mitigating the TCP throughput collapse.
Acknowledgements: We would like to thank Jeff Butler, Abbie Matthews, and Brian Mueller at Panasas Inc. for allowing us to conduct experiments on their systems and for helping us do so. We thank the members and companies of the PDL Consortium (including APC, Cisco, EMC, Google, Hewlett-Packard, Hitachi, IBM, Intel, LSI, Network Appliance, Oracle, Seagate, and Symantec) for their interest, insights, feedback, and support. Finally, we’d like to thank Michael Stroucken for his help managing the PDL cluster, and Michael Abd-el-Malek for feedback on our work. This material is based on research sponsored in part by the National Science Foundation, via grants #CNS-0546551, #CNS-0326453 and #CCF-0621499, by the Army Research Office under agreement number DAAD19-02-1-0389, by the Department of Energy under Award Number #DE-FC0206ER25767, and by DARPA under grant #HR00110710025.
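As a rough illustration of the overload condition described in the abstract, the following sketch checks whether the combined burst from several storage nodes fits in the switch buffer on the client's link. It is a simplified back-of-the-envelope model, not the paper's methodology: the function name, the parameters, and all numeric values are hypothetical, and TCP dynamics (windows, retransmission timeouts) are ignored.

```python
def overflow_bytes(num_servers, burst_bytes_per_server, buffer_bytes, drained_bytes=0):
    """Bytes that do not fit when all servers send their burst at once.

    Hypothetical model: every server sends one burst simultaneously toward
    the same client port; the port can hold buffer_bytes and forwards
    drained_bytes to the client while the burst arrives.
    """
    arriving = num_servers * burst_bytes_per_server
    capacity = buffer_bytes + drained_bytes
    return max(0, arriving - capacity)


# Assumed example values: 16 kB bursts and a 64 kB per-port buffer.
for n in (2, 4, 8, 16):
    lost = overflow_bytes(n, 16_000, 64_000)
    print(f"{n:2d} servers -> {lost} bytes exceed the buffer")
```

In this toy model the excess grows linearly with the number of servers once the buffer is full, which is only meant to show why striping over more nodes can push a fixed-size buffer past its limit.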
Similar resources
Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems
Cluster-based and iSCSI-based storage systems rely on standard TCP/IP-over-Ethernet for client access to data. Unfortunately, when data is striped over multiple networked storage nodes, a client can experience a TCP throughput collapse that results in much lower read bandwidth than should be provided by the available network links. Conceptually, this problem arises because the client simultaneo...
A (In)Cast of Thousands: Scaling Datacenter TCP to Kiloservers and Gigabits (CMU-PDL-09-101)
This paper presents a practical solution to the problem of high-fan-in, high-bandwidth synchronized TCP workloads in datacenter Ethernets—the Incast problem. In these networks, receivers often experience a drastic reduction in throughput when simultaneously requesting data from many servers using TCP. Inbound data overfills small switch buffers, leading to TCP timeouts lasting hundreds of milli...
Ursa Minor: Versatile Cluster-based Storage (CMU-PDL-05-104)
No single encoding scheme or fault model is optimal for all data. A versatile storage system allows them to be matched to access patterns, reliability requirements, and cost goals on a per-data item basis. Ursa Minor is a cluster-based storage system that allows data-specific selection of, and on-line changes to, encoding schemes and fault models. Thus, different data types can share a scalable...
Improve Throughput of Storage Cluster Interconnected with a TCP/IP Network Using Intelligent Server Grouping
Cluster-based storage systems connected with TCP/IP networks are expected to achieve a high throughput by striping files across multiple storage servers. However, for the storage system interconnected with the TCP/IP network, several critical issues, like Incast effect and data access interference, invalidate the assumption that higher access parallelism always results in increased I/O throughp...
Evaluating Multipath TCP Resilience against Link Failures
Standard TCP is the de facto reliable transfer protocol for the Internet. It is designed to establish a reliable connection using only a single network interface. However, standard TCP with single interfacing performs poorly due to intermittent node connectivity. This requires the re-establishment of connections as the IP addresses change. Multi-path TCP (MPTCP) has emerged to utilize multiple ...
Publication date: 2015